1 Crawling the web: Building domain-specific web collections for scientific digital
libraries: a meta-search enhanced focused crawling method
Jialun Qin, Yilu Zhou, Michael Chau

June 2004 Proceedings of the 4th ACM/IEEE-CS joint conference on Digital libraries 
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, index terms

Collecting domain-specific documents from the Web using focused crawlers has been 
considered one of the most important strategies to build digital libraries that serve the 
scientific community. However, because most focused crawlers use local search 
algorithms to traverse the Web space, they could be easily trapped within a limited sub- 
graph of the Web that surrounds the starting URLs and build domain-specific collections 
that are not comprehensive and diverse enough to scientists and researcher ... 



Keywords: digital libraries, domain-specific collection building, focused crawling, meta- 
search, web search algorithm 



2 Industry session 3: Intelligent metasearch engine for knowledge management
Eui-Hong Han, George Karypis, Doug Mewhort, Keith Hatchard

November 2003 Proceedings of the twelfth international conference on Information 
and knowledge management 

Publisher: ACM Press 

Full text available: pdf(368.32 KB) Additional Information: full citation, abstract, references, index terms

The explosive growth of available information sources and the resulting information 
overload pose several problems for users in many business organizations and educational 
institutions. First, searching through several information sources, one at a time, is a 
source of enormous frustration for users. Second, top-ranked documents in search results 
are frequently irrelevant to what users are interested in. To address these problems, we 
have developed ixmeta™, a powerful metasearch engine tha ...

Keywords: clustering, collaboration, collection fusion, library automation, metasearch, 
personalization 




http://portal.acm.org/results.cfm?coll=ACM&dl=ACM&CFID=68607474&CFTOKEN=7 1 8 .
2/15/06
Results (page 1): title: +metasearch or +"meta search"



Models for metasearch
Javed A. Aslam, Mark Montague

September 2001 Proceedings of the 24th annual international ACM SIGIR conference 
on Research and development in information retrieval 

Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

Given the ranked lists of documents returned by multiple search engines in response to a 
given query, the problem of metasearch is to combine these lists in a way which optimizes
the performance of the combination. This paper makes three contributions to the problem 
of metasearch: (1) We describe and investigate a metasearch model based on an optimal 
democratic voting procedure, the Borda Count; (2) we describe and investigate a 
metasearch model based on Bayesian inference; ... 
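The Borda Count mentioned in this abstract can be illustrated with a short sketch. The abstract is truncated, so this is not the authors' exact model: it is a plain Borda count under the assumption that each engine returns an ordered list of document identifiers, and that documents an engine did not return share that engine's leftover points equally. The function name `borda_fuse` and the sample identifiers are hypothetical.

```python
from collections import defaultdict


def borda_fuse(ranked_lists):
    """Fuse ranked lists of document ids with a simple Borda count.

    Assumption (not taken from the listing): documents missing from an
    engine's list split that engine's remaining points equally.
    """
    # Candidate pool: every document returned by at least one engine.
    pool = {doc for ranking in ranked_lists for doc in ranking}
    n = len(pool)
    scores = defaultdict(float)
    for ranking in ranked_lists:
        # The top document gets n points, the next n - 1, and so on.
        for position, doc in enumerate(ranking):
            scores[doc] += n - position
        # Points not handed out are shared among the unranked documents.
        missing = pool - set(ranking)
        if missing:
            remaining = sum(range(1, n - len(ranking) + 1))
            share = remaining / len(missing)
            for doc in missing:
                scores[doc] += share
    # Highest total first; ties are broken by document id for determinism.
    return sorted(pool, key=lambda doc: (-scores[doc], doc))


if __name__ == "__main__":
    engines = [["a", "b", "c"], ["b", "a", "d"], ["b", "c", "a"]]
    print(borda_fuse(engines))
```

Only rank positions are used here, so the sketch applies even when engines do not expose relevance scores.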

Rank aggregation methods for the Web

Cynthia Dwork, Ravi Kumar, Moni Naor, D. Sivakumar 

April 2001 Proceedings of the 10th international conference on World Wide Web 
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, references, citings, index terms



Keywords: metasearch, multi-word queries, rank aggregation, ranking functions, spam 



Experiences with selecting search engines using metasearch 
Daniel Dreilinger, Adele E. Howe 

July 1997 ACM Transactions on Information Systems (TOIS), Volume 15 Issue 3
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms, review



Search engines are among the most useful and high-profile resources on the Internet. 
The problem of finding information on the Internet has been replaced with the problem of 
knowing where search engines are, what they are designed to retrieve, and how to use 
them. This article describes and evaluates SavvySearch, a metasearch engine designed to 
intelligently select and interface with multiple remote search engines. The primary 
metasearch issue examined is the importance of carefully selecti ... 

Keywords: WWW, information retrieval, machine learning, search engine 



6 Search 1: Expert agreement and content based reranking in a meta search
environment using Mearf
B. Uygar Oztekin, George Karypis, Vipin Kumar

May 2002 Proceedings of the 11th international conference on World Wide Web 

Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

Recent increase in the number of search engines on the Web and the availability of meta 
search engines that can query multiple search engines makes it important to find effective 
methods for combining results coming from different sources. In this paper we introduce
novel methods for reranking in a meta search environment based on expert agreement
and contents of the snippets. We also introduce an objective way of evaluating different
methods for ranking search results that is based upon implici ...

Keywords: collection fusion, expert agreement, merging, meta search, reranking



Posters: Building an open source meta-search engine
A. Gulli, A. Signorini

May 2005 Special interest tracks and posters of the 14th international conference on 
World Wide Web 

Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, index terms

In this short paper we introduce Helios, a flexible and efficient open source meta-search
engine. Helios currently runs on top of 18 search engines (in Web, Books, News, and
Academic publication domains), but additional search engines can be easily plugged in.
We also report some performance measured during its development.

Keywords: meta search engines, open source 



Information retrieval: Condorcet fusion for improved retrieval
Mark Montague, Javed A. Aslam

November 2002 Proceedings of the eleventh international conference on Information
and knowledge management 

Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, citings

We present a new algorithm for improving retrieval results by combining document 
ranking functions: Condorcet-fuse. Beginning with one of the two major classes of voting 
procedures from Social Choice Theory, the Condorcet procedure, we apply a graph- 
theoretic analysis that yields a sorting-based algorithm that is elegant, efficient, and 
effective. The algorithm performs very well on TREC data, often outperforming existing 
metasearch algorithms whether or not relevance scores and training ... 
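The sorting-based idea described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: documents are sorted with a pairwise comparator that asks which of two documents a majority of the input rankings prefers, a document absent from a list is treated as ranked below everything that list contains, and Python's built-in sort stands in for whatever sorting routine the authors used. The name `condorcet_fuse` is hypothetical.

```python
from functools import cmp_to_key


def condorcet_fuse(ranked_lists):
    """Order documents by pairwise majority preference across rankings."""
    # Alphabetical pool gives a deterministic starting order for ties.
    pool = sorted({doc for ranking in ranked_lists for doc in ranking})
    positions = [{doc: i for i, doc in enumerate(r)} for r in ranked_lists]
    unranked = float("inf")  # assumption: missing documents rank last

    def compare(a, b):
        # margin > 0 means more rankings place a above b than b above a.
        margin = 0
        for pos in positions:
            rank_a = pos.get(a, unranked)
            rank_b = pos.get(b, unranked)
            if rank_a < rank_b:
                margin += 1
            elif rank_b < rank_a:
                margin -= 1
        if margin > 0:
            return -1
        if margin < 0:
            return 1
        return 0

    return sorted(pool, key=cmp_to_key(compare))


if __name__ == "__main__":
    engines = [["c", "a", "b"], ["c", "b", "a"], ["a", "c", "b"]]
    print(condorcet_fuse(engines))
```

Majority preferences can form cycles, in which case the comparator is not transitive and the resulting order depends on the sort's comparison sequence; the sketch makes no attempt to resolve such cycles.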

9 Information access and retrieval: Web metasearch: rank vs. score based rank
aggregation methods
M. Elena Renda, Umberto Straccia

March 2003 Proceedings of the 2003 ACM symposium on Applied computing 

Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, index terms

Given a set of rankings, the task of ranking fusion is the problem of combining these lists 
in such a way to optimize the performance of the combination. The ranking fusion 
problem is encountered in many situations and, e.g., metasearch is a prominent one. It 
deals with the problem of combining the result lists returned by multiple search engines in 
response to a given query, where each item in a result list is ordered with respect to a 
search engine and a relevance score. Several ranking ... 

Keywords: meta-search, rank list aggregation 
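To make the "score based" side of this comparison concrete, here is a minimal sketch of one widely used score-based fusion scheme, CombSUM with min-max normalization. The truncated abstract does not say which methods the paper evaluates, so the choice of CombSUM, the function names, and the sample scores are all assumptions made for illustration.

```python
def min_max(scores):
    """Rescale one engine's scores to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: treat every document as equally relevant.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def comb_sum(results):
    """Fuse per-engine {doc: score} maps by summing normalized scores."""
    fused = {}
    for scores in results:
        for doc, s in min_max(scores).items():
            fused[doc] = fused.get(doc, 0.0) + s
    # Highest fused score first; ties are broken by document id.
    return sorted(fused, key=lambda doc: (-fused[doc], doc))


if __name__ == "__main__":
    engine_1 = {"a": 10.0, "b": 5.0, "c": 0.0}
    engine_2 = {"b": 0.9, "c": 0.5, "a": 0.1}
    print(comb_sum([engine_1, engine_2]))
```

Normalization matters because engines report scores on incompatible scales; a rank-based method avoids the issue by discarding the scores altogether.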



Visual information retrieval from large distributed online repositories

Shih-Fu Chang, John R. Smith, Mandis Beigi, Ana Benitez 
December 1997 Communications of the ACM, volume 40 issue 12 
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, references, citings, index terms



Auctions and E-commerce: Paid placement strategies for internet search engines
Hemant K. Bhargava, Juan Feng 

May 2002 Proceedings of the 11th international conference on World Wide Web 
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, index terms

Internet search engines and comparison shopping have recently begun implementing a
paid placement strategy, where some content providers are given prominent positioning 
in return for a placement fee. This bias generates placement revenues but creates a 
disutility to users, thus reducing user-based revenues. We formulate the search engine 
design problem as a tradeoff between these two types of revenues. We demonstrate that 
the optimal placement strategy depends on the relative benefits (to provid ... 

Keywords: bias, information gatekeepers, paid placement, promotion, search engines 



An overview and classification of mediated query systems
Ruxandra Domenig, Klaus R. Dittrich
September 1999 ACM SIGMOD Record, Volume 28 Issue 3
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, citings, index terms

Multimedia technology, global information infrastructures and other developments allow 
users to access more and more information sources of various types. However, the 
"technical" availability alone (by means of networks, WWW, mail systems, databases,
etc.) is not sufficient for making meaningful and advanced use of all information available 
on-line. Therefore, the problem of effectively and efficiently accessing and querying 
heterogeneous and distributed data sources is an impo ... 

World Wide Web: Merging techniques for performing data fusion on the web

Theodora Tsikrika, Mounia Lalmas 

October 2001 Proceedings of the tenth international conference on Information and
knowledge management
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

Data fusion on the Web refers to the merging, into a unified single list, of the ranked 
document lists, which are retrieved in response to a user query by more than one Web 
search engine. It is performed by metasearch engines and their merging algorithms utilise 
the information present in the ranked lists of retrieved documents provided to them by 
the underlying search engines, such as the rank positions of the retrieved documents and 
their retrieval scores. In this paper, merging techniques are ... 

Keywords: Dempster-Shafer's theory of evidence, information retrieval, web data fusion 



Structured documents: Combining document representations for known-item search
Paul Ogilvie, Jamie Callan 

July 2003 Proceedings of the 26th annual international ACM SIGIR conference on
Research and development in information retrieval
Publisher: ACM Press 

Full text available: pdf(200.95 KB) Additional Information: full citation, abstract, references, citings, index terms

This paper investigates the pre-conditions for successful combination of document 
representations formed from structural markup for the task of known-item search. As this
task is very similar to work in meta-search and data fusion, we adapt several hypotheses
from those research areas and investigate them in this context. To investigate these 
hypotheses, we present a mixture-based language model and also examine many of the 
current meta-search algorithms. We find that compatible output from syst ... 

Keywords: data fusion, known-item finding, language models, meta-search algorithms



OAI application: Extending SDARTS: extracting metadata from web databases and
interfacing with the open archives initiative
Panagiotis G. Ipeirotis, Tom Barry, Luis Gravano

July 2002 Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries 

Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, index terms

SDARTS is a protocol and toolkit designed to facilitate metasearching. SDARTS combines 
two complementary existing protocols, SDLIP and STARTS, to define a uniform interface 
that collections should support for searching and exporting metasearch-related metadata. 
SDARTS also includes a toolkit with wrappers that are easily customized to make both 
local and remote document collections SDARTS-compliant. This paper describes two 
significant ways in which we have extended the SDARTS toolkit. First, we ... 

Keywords: SDLIP, distributed searching, metadata, metasearching, web databases, 
wrapper construction 



16 Posters: A unified model for metasearch and the efficient evaluation of retrieval
systems via the hedge algorithm
Javed A. Aslam, Virgiliu Pavlu, Robert Savell

July 2003 Proceedings of the 26th annual international ACM SIGIR conference on
Research and development in information retrieval

Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, index terms

We present a unified framework for simultaneously solving both the pooling problem (the 
construction of efficient document pools for the evaluation of retrieval systems) and 
metasearch (the fusion of ranked lists returned by retrieval systems in order to increase 
performance). The implementation is based on the Hedge algorithm for online learning, 
which has the advantage of convergence to bounded error rates approaching the 
performance of the best linear combination of the underlying syst ... 



Keywords: metasearch, pooling, retrieval systems 



Incremental clustering for profile maintenance in information gathering web agents
Gabriel L. Somlo, Adele E. Howe 

May 2001 Proceedings of the fifth international conference on Autonomous agents 
Publisher: ACM Press 

Full text available: pdf(377.14 KB) Additional Information: full citation, abstract, references, citings, index terms

User profiles are the central component of most personalized Web information agents. 
They consist of a set of models representing the various topics of interest to the user. 
Often the agent learns the user's preferences from examples of documents deemed 
relevant to the user. The topic of the document can either be supplied by the user (active 
modeling), or it must be guessed by the agent (passive modeling), which is more 
convenient but is expected to diminish the agent's accuracy. We presen ...

Keywords: adaptation and learning, information agents, user modeling



Architecture of a metasearch engine that supports user information needs

Eric J. Glover, Steve Lawrence, William P. Birmingham, C. Lee Giles
November 1999 Proceedings of the eighth international conference on Information
and knowledge management 

Publisher: ACM Press 

Full text available: pdf(858.64 KB) Additional Information: full citation, abstract, references, citings, index terms

When a query is submitted to a metasearch engine, decisions are made with respect to 
the underlying search engines to be used, what modifications will be made to the query, 
and how to score the results. These decisions are typically made by considering only the 
user's keyword query, neglecting the larger information need. Users with specific needs, 
such as "research papers" or "homepages," are not able to express these needs in a way 
that affects the decisions made b ... 

The consumer side of search: Bias on the web
Abbe Mowshowitz, Akira Kawaguchi 

September 2002 Communications of the ACM, Volume 45 Issue 9 
Publisher: ACM Press 

Full text available: pdf, html Additional Information: full citation, abstract, references, index terms

When it comes to measuring bias on the Web, there is clearly strength in numbers (of 
search engines, that is). 

20 System components for embedded information retrieval from multiple disparate
information sources

Ramana Rao, Daniel M. Russell, Jock D. Mackinlay

December 1993 Proceedings of the 6th annual ACM symposium on User interface
software and technology
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, references, citings, index terms



Posters: Scholars portal ...
Krisellen Maloney, John R. James
June 2004 Proceedings of the 4th ACM/IEEE-CS joint conference on Digital libraries
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, index terms

The Scholars Portal Project is a collaborative venture joining seven ARL Libraries and a
software vendor to develop an integrated web-based system that will connect
researchers, instructors and students with appropriate, vetted information resources. The
Project's initial focus has been on the meta-search "discovery" and direct linking
"delivery" tools that provide the software and metadata foundation for the Scholars Portal.
This paper will provide the current progress of implementation and pla ...



Keywords: metasearch, scholars portal 



22 Session 1A: Comparing top k lists
Ronald Fagin, Ravi Kumar, D. Sivakumar 

January 2003 Proceedings of the fourteenth annual ACM-SIAM symposium on 
Discrete algorithms 

Publisher: Society for Industrial and Applied Mathematics 

Full text available: pdf(1.02 MB) Additional Information: full citation, abstract, references, citings, index terms

Motivated by several applications, we introduce various distance measures between "top k 
lists." Some of these distance measures are metrics, while others are not. For each of 
these latter distance measures, we show that it is "almost" a metric in the following two
seemingly unrelated aspects: (i) it satisfies a relaxed version of the polygonal (hence,
triangle) inequality, and (ii) there is a metric with positive constant multiples that
bound our measure above and below. This is ...

23 Information access and retrieval: Shadow document methods of results merging

Shengli Wu, Fabio Crestani

March 2004 Proceedings of the 2004 ACM symposium on Applied computing 
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, index terms

In distributed information retrieval systems, document overlaps occur frequently across
results from different databases. This is especially the case for meta-search engines which 
merge results from several general-purpose web search engines. This paper addresses 
the problem of merging results which contain overlaps in order to achieve better 
performance. Several algorithms for merging results are proposed, which take advantage 
of the use of duplicate documents in two ways: one correlates scores ... 

Keywords: data fusion, information retrieval 



24 PERSIVAL, a system for personalized search and summarization over multimedia
healthcare information

Kathleen R. McKeown, Shih-Fu Chang, James Cimino, Steven Feiner, Carol Friedman, Luis 
Gravano, Vasileios Hatzivassiloglou, Steven Johnson, Desmond A. Jordan, Judith L. Klavans, 
Andre Kushniruk, Vimla Patel, Simone Teufel

January 2001 Proceedings of the 1st ACM/IEEE-CS joint conference on Digital
libraries
Publisher: ACM Press 

Full text available: pdf(369.13 KB) Additional Information: full citation, abstract, references, citings, index terms

In healthcare settings, patients need access to online information that can help them
understand their medical situation. Physicians need information that is clinically relevant 
to an individual patient. In this paper, we present our progress on developing a system, 
PERSIVAL, that is designed to provide personalized access to a distributed patient care 
digital library. Using the secure, online patient records at New York Presbyterian Hospital 
as a user model, PERSIVAL's components tailor s ... 

Keywords: medical digital library, multimedia, natural language, personalization, query 
interface, search, summarization 



Towards a highly-scalable and effective metasearch engine

Zonghuan Wu, Weiyi Meng, Clement Yu, Zhuogang Li 

April 2001 Proceedings of the 10th international conference on World Wide Web 
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, references, citings, index terms



Keywords: database selection, distributed text database, metasearch engine, resource 
discovery 



Efficient and effective metasearch for a large number of text databases 
Clement Yu, Weiyi Meng, King-Lup Liu, Wensheng Wu, Naphtali Rishe 
November 1999 Proceedings of the eighth international conference on Information
and knowledge management
Publisher: ACM Press 

Full text available: pdf(1.04 MB) Additional Information: full citation, abstract, references, citings, index terms

Metasearch engines can be used to facilitate ordinary users for retrieving information from
multiple local sources (text databases). In a metasearch engine, the contents of each 
local database is represented by a representative. Each user query is evaluated against 
the set of representatives of all databases in order to determine the appropriate 
databases to search. When the number of databases is very large, say in the order of tens
of thousands or more, then a traditional metasearch engin ...

27 Building efficient and effective metasearch engines 

Weiyi Meng, Clement Yu, King-Lup Liu 
March 2002 ACM Computing Surveys (CSUR), Volume 34 Issue 1

Publisher: ACM Press 

Full text available: pdf(416.07 KB) Additional Information: full citation, abstract, references, citings, index terms

Frequently a user's information needs are stored in the databases of multiple search 
engines. It is inconvenient and inefficient for an ordinary user to invoke multiple search 
engines and identify useful documents from the returned results. To support unified 
access to multiple search engines, a metasearch engine can be constructed. When a 
metasearch engine receives a query from a user, it invokes the underlying search engines 
to retrieve useful information for the user. Metasearch engines have ... 

Keywords: Collection fusion, distributed collection, distributed information retrieval, 
information resource discovery, metasearch 



Demonstrations: CMedPort: a cross-regional Chinese medical portal

Yilu Zhou, Jialun Qin, Hsinchun Chen, Zan Huang, Yiwen Zhang, Wingyan Chung, Gang Wang 
May 2003 Proceedings of the 3rd ACM/IEEE-CS joint conference on Digital libraries 
Publisher: IEEE Computer Society 

Full text available: pdf Additional Information: full citation, abstract, references, index terms

CMedPort is a cross-regional Chinese medical Web portal developed in the AI Lab at the 
University of Arizona. We will demonstrate the major system functionalities. 

29 Learning to find answers to questions on the Web
Eugene Agichtein, Steve Lawrence, Luis Gravano

May 2004 ACM Transactions on Internet Technology (TOIT), volume 4 issue 2 

Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, index terms

We introduce a method for learning to find documents on the Web that contain answers to 
a given natural language question. In our approach, questions are transformed into new 
queries aimed at maximizing the probability of retrieving answers from existing 
information retrieval systems. The method involves automatically learning phrase features 
for classifying questions into different types, automatically generating candidate query 
transformations from a training set of question/answer pairs, and ... 

Keywords: Web search, information retrieval, meta-search, query expansion, question 
answering 



A unified environment for fusion of information retrieval approaches 
M. Catherine McCabe, Abdur Chowdhury, David A. Grossman, Ophir Frieder
November 1999 Proceedings of the eighth international conference on Information
and knowledge management 

Publisher: ACM Press 

Full text available: pdf(566.22 KB) Additional Information: full citation, abstract, references, citings, index terms

Prior work has shown that combining results of various retrieval approaches and query 
representations can improve search effectiveness. Today, many meta-search engines 
exist which combine the results of various search engines in the hopes of improving
overall effectiveness. However, the combination of results from different search engines
masks variations in parsers, and other indexing techniques (stemming, stop words, etc.).
This makes it difficult to assess the utility of the fusion techni ...

Keywords: fusion, information retrieval, metasearch, retrieval, text 



Extending Matchmaking to Maximize Capability Reuse
Mario Gomez, Enric Plaza 

July 2004 Proceedings of the Third International Joint Conference on Autonomous
Agents and Multiagent Systems - Volume 1
Publisher: IEEE Computer Society 

Full text available: pdf Additional Information: full citation, abstract, index terms

This paper describes an extension of semantic matchmaking that aims at maximizing the
reuse of agent capabilities over new application domains. Our approach is to use an Agent 
Capability Description Language (ACDL) not only to describe the requests and the 
advertised capabilities, but also to describe the domain-models characterizing specific 
application domains. The description of tasks and capabilities is independent of any
particular domain, though a capability can specify the knowledge requi ... 

Posters: Rank aggregation for meta-search engines
Ka Wai Lam, Chi Ho Leung

May 2004 Proceedings of the 13th international World Wide Web conference on
Alternate track papers & posters 

Publisher: ACM Press 

Full text available: pdf(158.92 KB) Additional Information: full citation, abstract, references, index terms

In this paper, we present an algorithm for merging results from different data sources in
a meta-search engine. We further extend one that has been developed for ranking players of a
round-robin tournament to a more general one when the ranking input is given from
multiple sources. The problem in a meta-search engine can be represented by a complete
directed graph which can be used by the Majority Spanning Tree (MST) algorithm. It is
useful especially when the system must integrate and merge the query re ...

Keywords: meta-search engines, rank aggregation 



A highly scalable and effective method for metasearch
Weiyi Meng, Zonghuan Wu, Clement Yu, Zhuogang Li

July 2001 ACM Transactions on Information Systems (TOIS), Volume 19 Issue 3
Publisher: ACM Press 

Full text available: pdf(553.63 KB) Additional Information: full citation, abstract, references, citings, index terms

A metasearch engine is a system that supports unified access to multiple local search
engines. Database selection is one of the main challenges in building a large-scale 
metasearch engine. The problem is to efficiently and accurately determine a small number 
of potentially useful local search engines to invoke for each user query. In order to enable 
accurate selection, metadata that reflect the contents of each search engine need to be 
collected and used. This article proposes a highly scalable ... 

Keywords: Database selection, distributed text retrieval, metasearch engine, resource 
discovery



Efficient and effective metasearch for text databases incorporating linkages among
documents
Clement Yu, Weiyi Meng, Wensheng Wu, King-Lup Liu 

May 2001 ACM SIGMOD Record, Proceedings of the 2001 ACM SIGMOD international
conference on Management of data SIGMOD '01, Volume 30 Issue 2
Publisher: ACM Press 

Full text available: pdf(245.72 KB) Additional Information: full citation, abstract, references, citings, index terms

Linkages among documents have a significant impact on the importance of documents, as
it can be argued that important documents are pointed to by many documents or by other 
important documents. Metasearch engines can be used to facilitate ordinary users for 
retrieving information from multiple local sources (text databases). There is a search 
engine associated with each database. In a large-scale metasearch engine, the contents 
of each local database is represented by a representative. Each u ... 

Keywords: distributed collection, information retrieval, linkages among documents, 
metasearch 



Information source selection for resource constrained environments
Demet Aksoy

Distributed information retrieval has pressing scalability concerns due to the growing
number of independent sources of on-line data and the emerging applications. A 
promising solution to distributed retrieval is metasearching, which dispatches a user's 
query to multiple sources and gathers the results into a single result set. An important 
component of metasearching is selecting the set of information sources most likely to 
provide relevant documents. Recent research has focused on how to obtai ... 

36 Ranking: Comparing and aggregating rankings with ties
Ronald Fagin, Ravi Kumar, Mohammad Mahdian, D. Sivakumar, Erik Vee
June 2004 Proceedings of the twenty-third ACM SIGMOD-SIGACT-SIGART
symposium on Principles of database systems PODS '04 
Publisher: ACM Press 

Full text available: pdf(213.90 KB) Additional Information: full citation, abstract, references, citings

Rank aggregation has recently been proposed as a useful abstraction that has several 
applications, including meta-search, synthesizing rank functions from multiple indices, 
similarity search, and classification. In database applications (catalog searches, fielded 
searches, parametric searches, etc.), the rankings are produced by sorting an underlying 
database according to various fields. Typically, there are a number of fields that each 
have very few distinct values, and hence the corresponding ... 

Posters: Testbed for information extraction from deep web 

Yasuhiro Yamada, Nick Craswell, Tetsuya Nakatoh, Sachio Hirokawa 

May 2004 Proceedings of the 13th international World Wide Web conference on
Alternate track papers & posters
Publisher: ACM Press 

Full text available: pdf(24.74 KB) Additional Information: full citation, abstract, references, index terms

Search results generated by searchable databases are served dynamically and far larger 
than the static documents on the Web. These results pages have been referred to as the
Deep Web. We need to extract the target data in results pages to integrate them on
different searchable databases. We propose a test bed for information extraction from
search results. We chose 100 databases randomly from 114,540 pages with search forms.
Therefore, these databases have a good variety. We selected 51 database ... 

Keywords: deep web, meta search, testbed, wrapper



In this paper we describe how we combined SDLIP and STARTS, two complementary
protocols for searching over distributed document collections. The resulting protocol,
which we call SDARTS, is simple yet expressible enough to enable building sophisticated
metasearch engines. SDARTS can be viewed as an instantiation of SDLIP with
metasearch-specific elements from STARTS. We also report on our experience building 
three SDARTS-compliant wrappers: for locally available plain-text document collect ... 



Posters: MetaCrystal: visualizing the degree of overlap between different search
engines
Anselm Spoerri

May 2004 Proceedings of the 13th international World Wide Web conference on
Alternate track papers & posters 

Publisher: ACM Press

MetaCrystal enables users to visualize and control the degree of overlap between the
results returned by different search engines. Several linked overview tools support rapid 
exploration, facilitate complex filtering operations and guide users toward relevant 
information. MetaCrystal addresses the problem of the effective fusion of different search 
results by helping users to visually combine and filter the top results returned by the 
different engines. Users can apply weights to the search engi ... 



Keywords: information visualization, meta searching 

July 2000 Proceedings of the 23rd annual international ACM SIGIR conference on
Publisher: ACM Press

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

Most web pages are linked to others with related content. This idea, combined with 
another that says that text in, and possibly around, HTML anchors describe the pages to 
which they point, is the foundation for a usable World-Wide Web. In this paper, we 
examine to what extent these ideas hold by empirically testing whether topical locality 
mirrors spatial locality of pages on the Web. In particular, we find that the likelihood of 
linked pages having similar textual content to be ... 

41 Web Information Retrieval: Using sampled data and regression to merge search
Luo Si, Jamie Callan

August 2002 Proceedings of the 25th annual international ACM SIGIR conference on 
Research and development in information retrieval 

Publisher: ACM Press 

Full text available: pdf(263.96 KB) Additional Information: full citation, abstract, references, citings, index terms

This paper addresses the problem of merging results obtained from different databases 
and search engines in a distributed information retrieval environment. The prior research 
on this problem either assumed the exchange of statistics necessary for normalizing 
scores (cooperative solutions) or is heuristic. Both approaches have disadvantages. We
show that the problem in uncooperative environments is simpler when viewed as a 
component of a distributed IR system that uses query-based sampling to cr ... 

Keywords: distributed information retrieval, regression, results merging 

March 2004 ACM Computing Surveys (CSUR), volume 36 issue 1
Publisher: ACM Press 

Full text available: pdf(294.13 KB) Additional Information: full citation, abstract, references, index terms



With the explosive growth of the World Wide Web, the public is gaining access to massive 
amounts of information. However, locating needed and relevant information remains a 
difficult task, whether the information is textual or visual. Text search engines have 
existed for some years now and have achieved a certain degree of success. However, 
despite the large number of images available on the Web, image search engines are still 
rare. In this article, we show that in order to allow people to profi ... 

Keywords: Image-retrieval, World Wide Web, crawling, feature extraction and selection, 
indexing, relevance feedback, search, similarity 

The proliferation of online information resources increases the importance of effective and
efficient information retrieval in a multicollection environment. Multicollection searching is
cast in three parts: collection selection (also referred to as database selection), query 
processing and results merging. In this work, we focus our attention on the evaluation of 
the first step, collection selection. In this article, we present a detailed discussion of the 
methodology that we used to evaluate an ... 

Keywords: Collection selection, database selection, distributed information retrieval, 
distributed text retrieval, metasearch engine, resource discovery, resource ranking, 
resource selection, server ranking, server selection, text retrieval 

Full text available: pdf(92.98 KB) Additional Information: full citation, abstract, references, citings, index terms

When searching the WWW, users often desire results restricted to a particular document 
category. Ideally, a user would be able to filter results with a text classifier to minimize 
false positive results; however, current search engines allow only simple query 
modifications. To automate the process of generating effective query modifications, we 
introduce a sensitivity analysis-based method for extracting rules from nonlinear support 
vector machines. The proposed method allows the user to specify ... 

Keywords: query modification, rule extraction, sensitivity analysis, support vector 
machine 

June 1997 ACM SIGMOD Record, Proceedings of the 1997 ACM SIGMOD international
conference on Management of data SIGMOD '97, volume 26 issue 2 
Publisher: ACM Press 

Full text available: pdf(1.55 MB) Additional Information: full citation, abstract, references, citings, index terms

Document sources are available everywhere, both within the internal networks of 
organizations and on the Internet. Even individual organizations use search engines from 
different vendors to index their internal document collections. These search engines are 
typically incompatible in that they support different query models and interfaces, they do 
not return enough information with the query results for adequate merging of the results, 
and finally, in that they do not export metadata about t ... 

XLibris: an automated library research assistant

Andrew Crossen, Jay Budzik, Mason Warner, Larry Birnbaum, Kristian J. Hammond 
January 2001 Proceedings of the 6th international conference on Intelligent user
interfaces 

Publisher: ACM Press 

Full text available: pdf(209.48 KB) Additional Information: full citation, abstract, references, citings, index terms

While recent work has focused on providing tools and infrastructure for users to access
electronic information over the Internet, the relationship between the physical world and
information available online has been relatively unexplored. Information about a user's
location, and the objects she interacts with, can be sufficient to recognize enough of the 
user's task to drive retrieval of online information relevant to the task at hand. The XLibris 
system automatically retrieves, aggregates ... 

Keywords: automated retrieval, information aggregation, metasearch, ubiquitous 
computing 

The proliferation of searchable text databases on local area networks and the Internet
causes the problem of finding information that may be distributed among many disjoint
text databases (distributed information retrieval). How to merge the results returned by
selected databases is an important subproblem of the distributed information retrieval 
task. Previous research assumed that either resource providers cooperate to provide 
normalizing statistics or search clients download all retrie ... 

Keywords: Distributed information retrieval, resource ranking, resource selection, results 
merging, semisupervised learning method, server selection 

Publisher: ACM Press

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

As the Web has been growing exponentially, it has become increasingly difficult to search 
for desired information. In recent years, many domain-specific (vertical) search tools have 
been developed to serve the information needs of specific fields. This paper describes two 
approaches to building a domain-specific search tool. We report our experience in building 
two different tools in the nanotechnology domain — (1) a server-side search engine, and 
(2) a client-side search agent. The designs of ... 

Keywords: indexing, information retrieval, internet searching and browsing, internet 
spider, noun-phrasing, personalization, post-retrieval analysis, self-organizing map, 
summarization, vertical search engine, web search engine 

Database selection is an important step when searching over large numbers of distributed
text databases. The database selection task relies on statistical summaries of the
database contents, which are not typically exported by databases. Previous research has
developed algorithms for constructing an approximate content summary of a text
database from a small document sample extracted via querying. Unfortunately, Zipf's law
practically guarantees that content summaries built this way for any rela ... 

August 2002 Proceedings of the 25th annual international ACM SIGIR conference on
Research and development in information retrieval

This poster describes initial work exploring a relatively unexamined area of data fusion:
fusing the results of retrieval systems whose collections have no overlap between them. 
Many of the effective meta-search/data fusion strategies gain much of their success from 
exploiting document overlap across the source systems being merged. When the 
intersection of the collections is the empty set, the strategies generally degrade to a 
simpler form. In order to address such situations, two strategies we ... 

Publisher: IBM Press

Full text available: pdf(253.28 KB) Additional Information: full citation, abstract, references, index terms

The research described in this paper examined two different approaches to building the 
co-citation network that the authors have used in re-ranking the set of results returned by 
a search engine [22, 23]. The more computationally demanding (in terms of query load) 
Inter- or Web-wide co-citation approach used in-links from throughout the Web to build 
the network. In contrast, the Intra co-citation approach only used in-links inferred from
search engine output. Results of this study confirmed th ... 

Eduard Hovy, Wessel Kraaij, John Lafferty, Victor Lavrenko, David Lewis, Liz Liddy, R.
Manmatha, Andrew McCallum, Jay Ponte, John Prager, Dragomir Radev, Philip Resnik,
Stephen Robertson, Roni Rosenfeld, Salim Roukos, Mark Sanderson, Rich Schwartz, Amit

November 2002 Proceedings of the eleventh international conference on Information

and knowledge management 
Publisher: ACM Press 

Full text available: pdf(285.83 KB) Additional Information: full citation, abstract, references, citings, index terms

Current web search engines are built to serve all users, independent of the needs of any
individual user. Personalization of web search is to carry out retrieval for each user
incorporating his/her interests. We propose a novel technique to map a user query to a
set of categories, which represent the user's search intention. This set of categories can
serve as a context to disambiguate the words in the user's query. A user profile and a
general profile are learned from the user's search history ... 

Keywords: category hierarchy, information filtering, personalization, search engine 

Significant efforts are being made to digitize rare and valuable library materials, with the
goal of providing patrons and historians digital facsimiles that capture the "look and feel" 
of the original materials. This is often done by digitally photographing the materials and 
making high resolution 2D images available. The underlying assumption is that the 
objects are flat. However, older materials may not be flat in practice, being warped and
crinkled due to decay, neg ... 

Keywords: World Wide Web, distributed information retrieval, effectiveness evaluation, 
server selection 

In this paper the score distributions of a number of text search engines are modeled. It is
shown empirically that the score distributions on a per query basis may be fitted using an 
exponential distribution for the set of non-relevant documents and a normal distribution 
for the set of relevant documents. Experiments show that this model fits TREC-3 and 
TREC-4 data for not only probabilistic search engines like INQUERY but also vector space 
search engines like SMART for English. We have als ... 



57 



Using query mediators for distributed searching in federated digital libraries 






http://portal.acm.org/resultsxfm?query=title%3A%20%2Bmetasearch%20or%20%2B%22m... 2/15/06 



Results (page 3): title: +metasearch or +"meta search" 






Naomi Dushay, James C. French, Carl Lagoze 
August 1999 Proceedings of the fourth ACM conference on Digital libraries
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, references, citings, index terms



Keywords: distributed searching, query routing, search engine performance 



Internet data management (IDM): Learning query languages of Web interfaces
Andre Bergholz, Boris Chidlovskii

March 2004 Proceedings of the 2004 ACM symposium on Applied computing
Publisher: ACM Press

Full text available: pdf(253.16 KB) Additional Information: full citation, abstract, references

This paper studies the problem of automatic acquisition of the query languages supported 
by a Web information resource. We describe a system that automatically probes the 
search interface of a resource with a set of test queries and analyses the returned pages 
to recognize supported query operators. The automatic acquisition assumes the 
availability of the number of matches the resource returns for a submitted query. The 
match numbers are used to train a learning system and to generate classific ... 

Keywords: hidden Web, learning, query operators, search interface 

Keyword-based search engines are in widespread use today as a popular means for Web-
based information retrieval. Although such systems seem deceptively simple, a 
considerable amount of skill is required in order to satisfy non-trivial information needs. 
This paper presents a new conceptual paradigm for performing search in context, that 
largely automates the search process, providing even non-professional users with highly 
relevant results. This paradigm is implemented in practice in the Intelli ... 

Keywords: Search, context, invisible web, semantic processing, statistical natural 
language processing 

In meta-searchers accessing distributed Web-based information repositories, performance
is a major issue. Efficient query processing requires an appropriate caching mechanism. 
Unfortunately, standard page-based as well as tuple-based caching mechanisms designed 
for conventional databases are not efficient on the Web, where keyword-based querying is 
often the only way to retrieve data. In this work, we study the problem of semantic 
caching of Web queries and develop a caching mechanism for conjun ... 

Keywords: Experiments, Query algorithms, Region containment, Semantic caching,
Signature files 

Emerging peer-to-peer computing provides new possibilities but also challenges for
distributed applications. Despite their significant potential, current peer-to-peer networks 
lack efficient knowledge discovery and management. This paper addresses this deficiency 
and proposes the Intelligent File Sharing framework, which provides an effective and 
flexible query for P2P file sharing. The IFS is based on powerful schema and flexible 
inference, as well as efficiently integrated and extensible retri ... 

Keywords: association rules, encoding, hierarchy, peer-to-peer file sharing, reasoning, 
retrieval, search 

Searching over heterogeneous information sources is difficult in part because of the
nonuniform query languages. Our approach is to allow users to compose Boolean queries
in one rich front-end language. For each user query and target source, we transform the
user query into a subsuming query that can be supported by the source but that may
return extra documents. The results are then processed by a filter query to yield the 
correct final results. In this article we introduce the architectur ... 

Keywords: Boolean queries, content-based retrieval, filtering, predicate rewriting, query 
subsumption, query translation 

We propose to use the community structure of Usenet for organizing and retrieving the
information stored in newsgroups. In particular, we study the network formed by cross- 
posts, messages that are posted to two or more newsgroups simultaneously. We present 
what is, to our knowledge, by far the most detailed data that has been collected on 
Usenet cross-postings. We analyze this network to show that it is a small-world network 
with significant clustering. We also present a spectral algorithm which ... 

Keywords: clustering, spectral method, Usenet 

Combining retrieval results from multiple modalities plays a crucial role for video retrieval
systems, especially for automatic video retrieval systems without any user feedback and
query expansion. However, most current systems only utilize query-independent
combination or rely on explicit user weighting. In this work, we propose using query-class
dependent weights within a hierarchical mixture-of-expert framework to combine multiple
retrieval results. We first classify each user query int ...

Keywords: learning, modality fusion, query class, video retrieval 

Recent work in data integration has shown the importance of statistical information about 
the coverage and overlap of sources for efficient query processing. Despite this 
recognition there are no effective approaches for learning the needed statistics. The key 
challenge in learning such statistics is keeping the number of needed statistics low enough 
to have the storage and learning costs manageable. Naive approaches can become 
infeasible very quickly. In this paper we present a set of connected ... 

With the phenomenal growth of the Web, there is an ever-increasing volume of data and
information published in numerous Web pages. The research in Web mining aims to 
develop new techniques to effectively extract and mine useful knowledge or information 
from these Web pages [8]. Due to the heterogeneity and lack of structure of Web data, 
automated discovery of targeted or unexpected knowledge/information is a challenging 
task. It calls for novel methods that draw from a wide range of fields spanni ... 

Distributed IR systems query a large number of IR servers, merge the retrieved results
and display them to users. Since different servers handle collections of different sizes, 
have different processing and bandwidth capacities, there can be considerable 
heterogeneity in their response times. The broker in the distributed IR system thus has to 
make decisions regarding terminating searches based on perceived value of waiting 
retrieving more documents — and the costs imposed on users by waitin ... 

Keywords: distributed IR, optimal wait time, query termination, utility theory 

We describe an architecture for building speech-enabled conversational agents, deployed
as self-contained Web services, with ability to provide inference processing on very large
knowledge bases and its application to voice enabled chatbots in a virtual storytelling
environment. The architecture integrates inference engines, natural language pattern
matching components and story-specific information extraction from RDF/XML files. Our
Web interface is dynamically generated by server side agents s ...

natural language and speech processing, virtual storytelling



GlOSS: text-source discovery over the Internet




Luis Gravano, Hector Garcia-Molina, Anthony Tomasic 
June 1999 ACM Transactions on Database Systems (TODS), volume 24 issue 2 



Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms, review

The dramatic growth of the Internet has created a new problem for users: location of the 
relevant sources of documents. This article presents a framework for (and experimentally 
analyzes a solution to) this problem, which we call the text-source discovery problem. Our 
approach consists of two phases. First, each text source exports its contents to a 
centralized service. Second, users present queries to the service, which returns an 
ordered list of promising text sources. T ... 

Keywords: Internet search and retrieval, digital libraries, distributed information 
retrieval, text databases 

A virtual database system is software that provides unified access to multiple information
sources. If the sources are overlapping in their contents and independently maintained, 
then the likelihood of inconsistent answers is high. Solutions are often based on ranking 
(which sorts the different answers according to recurrence) and on fusion (which 
synthesizes a new value from the different alternatives according to a specific formula). In 
this paper we argue that both method ... 

DMribuj,edj„ModeJ Q 
Luo Si, Jamie Callan

August 2005 Proceedings of the 28th annual international ACM SIGIR conference on
Research and development in information retrieval SIGIR '05 

Publisher: ACM Press 

Full text available: pdf(266.94 KB) Additional Information: full citation, abstract, references, index terms






Results (page 4): title: +metasearch or +"meta search" 






Federated search links multiple search engines into a single, virtual search system. Most 
prior research of federated search focused on selecting search engines that have the most 
relevant contents, but ignored the retrieval effectiveness of individual search engines. 
This omission can cause serious problems when federating search engines of different 
qualities. This paper proposes a federated search technique that uses utility maximization
to model the retrieval effectiveness of each search engi ... 

Keywords: model search engine effectiveness 

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) began as an
alternative to distributed searching of scholarly eprint repositories. The model embraced
by the OAI-PMH is that of metadata harvesting, where value-added services (by a
"service provider") are constructed on cached copies of the metadata extracted from the 
repositories of the harvester's choosing. While this model dispenses with the well known 
problems of distributed searching, it introduces the problem of synch ... 

Prior research under a variety of conditions has shown the CORI algorithm to be one of
the most effective resource selection algorithms, but the range of database sizes studied 
was not large. This paper shows that the CORI algorithm does not do well in 
environments with a mix of "small" and "very large" databases. A new resource selection 
algorithm is proposed that uses information about database sizes as well as database 
contents. We also show how to acquire database size estimates in uncoopera ... 

The Alexandria Digital Earth ProtoType (ADEPT) architecture is a framework for building
distributed digital libraries of georeferenced information. An ADEPT system comprises one 
or more autonomous libraries, each of which provides a uniform interface to one or more 
collections, each of which manages metadata for one or more items. The primary 
standard on which the architecture is based is the ADEPT bucket framework, which 
defines uniform client-level metadata query services that are compatible w ... 






Keywords: bucket framework, collection discovery, distribution, interoperability, 
metadata 



Web technologies
Gabriel L. Somlo, Adele E. Howe 
July 2003 Proceedings of the second international joint conference on Autonomous
agents and multiagent systems 
Publisher: ACM Press 

Full text available: pdf Additional Information: full citation, abstract, references, index terms

Personalized information agents can help overcome some of the limitations of communal 
Web information sources such as portals and search engines. Two important components 
of these agents are: user profiles and information filtering or gathering services. Ideally, 
these components can be separated so that a single user profile can be leveraged for a 
variety of information services. Toward that end, we are building an information agent 
called SurfAgent; in previous studies, we have develope ... 